342 research outputs found

Real-Time Audio-to-Score Alignment of Music Performances Containing Errors and Arbitrary Repeats and Skips

Authors: Nakamura Eita, Nakamura Tomohiko, Sagayama Shigeki
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 24/12/2015

This paper discusses real-time alignment of audio signals of music performance to the corresponding score (a.k.a. score following) which can handle tempo changes, errors and arbitrary repeats and/or skips (repeats/skips) in performances. This type of score following is particularly useful in automatic accompaniment for practices and rehearsals, where errors and repeats/skips are often made. Simple extensions of the algorithms previously proposed in the literature are not applicable in these situations for scores of practical length due to the problem of large computational complexity. To cope with this problem, we present two hidden Markov models of monophonic performance with errors and arbitrary repeats/skips, and derive efficient score-following algorithms with an assumption that the prior probability distributions of score positions before and after repeats/skips are independent from each other. We confirmed real-time operation of the algorithms with music scores of practical length (around 10000 notes) on a modern laptop and their tracking ability to the input performance within 0.7 s on average after repeats/skips in clarinet performance data. Further improvements and extension for polyphonic signals are also discussed.

Comment: 12 pages, 8 figures, version accepted in IEEE/ACM Transactions on Audio, Speech, and Language Processing

arXiv.org e-Print Archive
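
The independence assumption mentioned in the abstract above can be illustrated with a small sketch. It is not the authors' algorithm: the transition structure (`stay`, `step`, `leave`, `arrive`) and all function names are hypothetical, chosen only to show why a jump probability that factorizes into "leave position i" times "arrive at position j" lets one forward update run in time linear in the number of score positions instead of quadratic.

```python
def forward_step(alpha, stay, step, leave, arrive, emit):
    """One normalized HMM forward update over N score positions in O(N).

    Assumed (hypothetical) transition structure:
      stay[i]   : probability of remaining at position i
      step[i]   : probability of moving from i to i + 1
      leave[i]  : probability of a repeat/skip starting at i
      arrive[j] : probability that a repeat/skip lands on j, independent of i
    emit[j] is the likelihood of the current audio frame at position j.
    """
    n = len(alpha)
    # Because the jump factorizes, the total jumping mass is a single scalar.
    jump_mass = sum(alpha[i] * leave[i] for i in range(n))
    new_alpha = []
    for j in range(n):
        local = alpha[j] * stay[j]
        if j > 0:
            local += alpha[j - 1] * step[j - 1]
        new_alpha.append(emit[j] * (local + jump_mass * arrive[j]))
    total = sum(new_alpha)
    return [a / total for a in new_alpha]


def forward_step_dense(alpha, stay, step, leave, arrive, emit):
    """Reference O(N^2) update using the full transition matrix."""
    n = len(alpha)
    trans = [[leave[i] * arrive[j] for j in range(n)] for i in range(n)]
    for i in range(n):
        trans[i][i] += stay[i]
        if i + 1 < n:
            trans[i][i + 1] += step[i]
    new_alpha = [
        emit[j] * sum(alpha[i] * trans[i][j] for i in range(n))
        for j in range(n)
    ]
    total = sum(new_alpha)
    return [a / total for a in new_alpha]


# Toy example with four score positions (all numbers are made up).
alpha = [0.7, 0.2, 0.1, 0.0]
stay, step, leave = [0.2] * 4, [0.7] * 4, [0.1] * 4
arrive = [0.25] * 4
emit = [0.1, 0.6, 0.2, 0.1]
fast = forward_step(alpha, stay, step, leave, arrive, emit)
dense = forward_step_dense(alpha, stay, step, leave, arrive, emit)
```

Both functions return the same posterior over score positions; the first avoids building the N-by-N matrix, which is the kind of saving that matters for scores of around 10000 notes.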

Sampling-Frequency-Independent Universal Sound Separation

Authors: Nakamura Tomohiko, Yatabe Kohei
Publication date: 21/09/2023

This paper proposes a universal sound separation (USS) method capable of handling untrained sampling frequencies (SFs). The USS aims at separating arbitrary sources of different types and can be the key technique to realize a source separator that can be universally used as a preprocessor for any downstream tasks. To realize a universal source separator, there are two essential properties: universalities with respect to source types and recording conditions. The former property has been studied in the USS literature, which has greatly increased the number of source types that can be handled by a single neural network. However, the latter property (e.g., SF) has received less attention despite its necessity. Since the SF varies widely depending on the downstream tasks, the universal source separator must handle a wide variety of SFs. In this paper, to encompass the two properties, we propose an SF-independent (SFI) extension of a computationally efficient USS network, SuDoRM-RF. The proposed network uses our previously proposed SFI convolutional layers, which can handle various SFs by generating convolutional kernels in accordance with an input SF. Experiments show that signal resampling can degrade the USS performance and the proposed method works more consistently than signal-resampling-based methods for various SFs.

Comment: Submitted to ICASSP202

arXiv.org e-Print Archive

Time-Domain Audio Source Separation Based on Wave-U-Net Combined with Discrete Wavelet Transform

Authors: Nakamura Tomohiko, Saruwatari Hiroshi
Publication date: 28/01/2020

We propose a time-domain audio source separation method using down-sampling (DS) and up-sampling (US) layers based on a discrete wavelet transform (DWT). The proposed method is based on one of the state-of-the-art deep neural networks, Wave-U-Net, which successively down-samples and up-samples feature maps. We find that this architecture resembles that of multiresolution analysis, and reveal that the DS layers of Wave-U-Net cause aliasing and may discard information useful for the separation. Although the effects of these problems may be reduced by training, to achieve a more reliable source separation method, we should design DS layers capable of overcoming the problems. With this belief, focusing on the fact that the DWT has an anti-aliasing filter and the perfect reconstruction property, we design the proposed layers. Experiments on music source separation show the efficacy of the proposed method and the importance of simultaneously considering the anti-aliasing filters and the perfect reconstruction property.

Comment: 5 pages, to appear in IEEE International Conference on Acoustics, Speech, and Signal Processing 2020 (ICASSP 2020)

arXiv.org e-Print Archive
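
The two properties named in the abstract above (anti-aliasing and perfect reconstruction) can be seen in a minimal sketch using the Haar wavelet, the simplest DWT. This is an illustration under assumptions, not the layers proposed in the paper: the function names are hypothetical, and the choice of the Haar wavelet is ours.

```python
import math


def decimate(x):
    """Naive down-sampling: keep every second sample, no anti-aliasing filter."""
    return x[::2]


def haar_analysis(x):
    """One Haar DWT level: half-rate low-pass and high-pass bands."""
    if len(x) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    low = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return low, high


def haar_synthesis(low, high):
    """Inverse of haar_analysis: interleave the reconstructed sample pairs."""
    s = math.sqrt(2.0)
    x = []
    for a, d in zip(low, high):
        x.append((a + d) / s)
        x.append((a - d) / s)
    return x


# A tone at the Nyquist frequency: alternating +1 / -1.
signal = [1.0, -1.0] * 4

aliased = decimate(signal)            # looks like a constant (DC) signal
low, high = haar_analysis(signal)     # the tone lands in the high band only
restored = haar_synthesis(low, high)  # both bands together recover the input
```

Plain decimation turns the alternating tone into a constant, which is aliasing. The Haar low band is zero for the same input, and keeping both bands allows the original samples to be recovered.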

Continuous negative extrathoracic pressure combined with high-frequency oscillation improves oxygenation with less impact on blood pressure than high-frequency oscillation alone in a rabbit model of surfactant depletion

Authors: Hiroma Takehiko, Naito Sachie, Nakamura Tomohiko
Publication venue: BioMed Central
Publication date: 01/01/2007

Background: Negative air pressure ventilation has been used to maintain adequate functional residual capacity in patients with chronic muscular disease and to decrease transpulmonary pressure and improve cardiac output during right heart surgery. High-frequency oscillation (HFO) exerts beneficial effects on gas exchange in neonates with acute respiratory failure. We examined whether continuous negative extrathoracic pressure (CNEP) combined with HFO would be effective for treating acute respiratory failure in an animal model.

Methods: The effects of CNEP combined with HFO on pulmonary gas exchange and circulation were examined in a surfactant-depleted rabbit model. After induction of severe lung injury by repeated saline lung lavage, 18 adult white Japanese rabbits were randomly assigned to 3 groups: Group 1, CNEP (extrathoracic negative pressure, -10 cmH2O) with HFO (mean airway pressure (MAP), 10 cmH2O); Group 2, HFO (MAP, 10 cmH2O); and Group 3, HFO (MAP, 15 cmH2O). Physiological and blood gas data were compared among groups using analysis of variance.

Results: Group 1 showed significantly higher oxygenation than Group 2, and the same oxygenation with significantly higher mean blood pressure compared to Group 3.

Conclusion: Adequate CNEP combined with HFO improves oxygenation with less impact on blood pressure than high-frequency oscillation alone in an animal model of respiratory failure.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Effects of heliox as carrier gas on ventilation and oxygenation in an animal model of piston-type HFOV: a crossover experimental study

Authors: Hiroma Takehiko, Nakamura Tomohiko, Zeynalov Bakhtiyar
Publication venue: BioMed Central
Publication date: 01/01/2010

Objective: This study aimed to compare gas exchange with heliox and oxygen-enriched air during piston-type high-frequency oscillatory ventilation (HFOV). We hypothesized that helium gas would improve both carbon dioxide elimination and arterial oxygenation during piston-type HFOV.

Method: Five rabbits were prepared and ventilated by piston-type HFOV with carrier 50% helium/oxygen (heliox50) or 50% oxygen/nitrogen (nitrogen50) gas mixture in a crossover study. Changing the gas mixture from nitrogen50 to heliox50 and back was performed five times per animal with constant ventilation parameters. Arterial blood gas, vital function and respiratory test indices were recorded.

Results: Compared with nitrogen50, heliox50 did not change PaCO2 when stroke volume remained constant, but significantly reduced PaCO2 after alignment of amplitude pressure. No significant changes in PaO2 were seen despite significant decreases in mean airway pressure with heliox50 compared with nitrogen50.

Conclusion: This study demonstrated that heliox enhances CO2 elimination and maintains oxygenation at the same amplitude but with lower airway pressure compared to air/O2 mix gas during piston-type HFOV.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Physics-informed convolutional neural network with bicubic spline interpolation for sound field estimation

Authors: Koyama Shoichi, Nakamura Tomohiko, Saruwatari Hiroshi, Shigemi Kazuhide
Publication date: 22/07/2022

A sound field estimation method based on a physics-informed convolutional neural network (PICNN) using spline interpolation is proposed. Most of the sound field estimation methods are based on wavefunction expansion, making the estimated function satisfy the Helmholtz equation. However, these methods rely only on physical properties; thus, they suffer from a significant deterioration of accuracy when the number of measurements is small. Recent learning-based methods based on neural networks have advantages in estimating from sparse measurements when training data are available. However, since physical properties are not taken into consideration, the estimated function can be a physically infeasible solution. We propose the application of PICNN to the sound field estimation problem by using a loss function that penalizes deviation from the Helmholtz equation. Since the output of CNN is a spatially discretized pressure distribution, it is difficult to directly evaluate the Helmholtz-equation loss function. Therefore, we incorporate bicubic spline interpolation in the PICNN framework. Experimental results indicated that accurate and physically feasible estimation from sparse measurements can be achieved with the proposed method.

Comment: Accepted to International Workshop on Acoustic Signal Enhancement (IWAENC) 202

arXiv.org e-Print Archive
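
The "loss function that penalizes deviation from the Helmholtz equation" in the abstract above could take a form like the following. This is a sketch under assumptions, not the loss used in the paper: the measurements s_m, the weight \lambda, the evaluation points \mathbf{r}'_n, and the mean-squared form of both terms are hypothetical. Here \hat{u} denotes the continuous pressure field obtained by interpolating the CNN's grid output; a bicubic spline interpolant is piecewise polynomial, so its second derivatives are available in closed form.

```latex
% Helmholtz equation for the sound pressure u at angular frequency \omega,
% with wavenumber k = \omega / c and sound speed c:
\left( \nabla^{2} + k^{2} \right) u(\mathbf{r}) = 0

% Hypothetical training loss: data fit at the M measurement positions \mathbf{r}_m
% plus a Helmholtz penalty at N evaluation points \mathbf{r}'_n:
\mathcal{L} = \frac{1}{M} \sum_{m=1}^{M} \left| \hat{u}(\mathbf{r}_m) - s_m \right|^{2}
  + \lambda \, \frac{1}{N} \sum_{n=1}^{N}
    \left| \nabla^{2} \hat{u}(\mathbf{r}'_n) + k^{2} \hat{u}(\mathbf{r}'_n) \right|^{2}
```

The second term is zero exactly when \hat{u} satisfies the Helmholtz equation at every evaluation point, so a larger \lambda pushes the estimate toward physically feasible fields.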

Functional Evaluation of Bubble CPAP for Neonates Using a Leak Model

Authors: Baba Atsushi, Koike Kenichi, Nakamura Tomohiko, Saida Ken
Publication venue: Shinshu Medical Society (信州医学会)
Publication date: 10/04/2013

Article. Shinshu Medical Journal (信州医学雑誌) 61(2):65-73 (2013). Journal article

Shinshu University Institutional Repository

Head-Related Transfer Function Interpolation from Spatially Sparse Measurements Using Autoencoder with Source Position Conditioning

Authors: Ito Yuki, Koyama Shoichi, Nakamura Tomohiko, Saruwatari Hiroshi
Publication date: 22/07/2022

We propose a method of head-related transfer function (HRTF) interpolation from sparsely measured HRTFs using an autoencoder with source position conditioning. The proposed method is drawn from an analogy between an HRTF interpolation method based on regularized linear regression (RLR) and an autoencoder. Through this analogy, we found the key feature of the RLR-based method that HRTFs are decomposed into source-position-dependent and source-position-independent factors. On the basis of this finding, we design the encoder and decoder so that their weights and biases are generated from source positions. Furthermore, we introduce an aggregation module that reduces the dependence of latent variables on source position for obtaining a source-position-independent representation of each subject. Numerical experiments show that the proposed method can work well for unseen subjects and achieve an interpolation performance with only one-eighth measurements comparable to that of the RLR-based method.

Comment: Accepted to International Workshop on Acoustic Signal Enhancement (IWAENC) 202

arXiv.org e-Print Archive